# RLHF optimization
Fsfairx Gemma2 RM V0.1
A reward model based on the Gemma-2-9B architecture, trained with an RLHF workflow and suited to dialogue and reasoning tasks.
Large Language Model
Transformers

sfairXC
51
7
Norgpt 3B Rfhl Summarization
A text summarization model based on NorGPT-3B, fine-tuned on Norwegian news summarization datasets using an RLHF strategy.
Text Generation
Transformers Other

NorGLM
56
0
Distilroberta Base Rejection V1
Apache-2.0
A text classification model fine-tuned from distilroberta-base to identify rejection responses generated by large language models.
Text Classification
Transformers English

protectai
74.91k
7
Starling LM 7B Alpha
Apache-2.0
The first open-source large language model trained with Reinforcement Learning from AI Feedback (RLAIF), showing strong performance on MT Bench.
Large Language Model
Transformers English

berkeley-nest
9,765
558
Xwin LM 70B V0.1
Xwin-LM is a language model based on Llama2 that focuses on alignment techniques for large language models and performs strongly on the AlpacaEval benchmark.
Large Language Model
Transformers

Xwin-LM
1,161
214